Multi-Domain by rafapi · Pull Request #105 · ServiceNow/PipelineRL

rafapi · 2025-11-17T19:43:23Z

Enables simultaneous training across multiple domains (math, coding, function calling) with domain-agnostic orchestration.

Architecture

Component	Role
multidomain/loader.py	Parses :: syntax, concatenates datasets, injects domain field into each sample
dispatcher.py	Routes problem["domain"] → domain-specific rollout callable via actor.domain_rollouts mapping
domain_sampling.py	Weighted sampling with adaptive rebalancing based on completion ratios to maintain target mix despite varying rollout latencies
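
To illustrate the dispatcher row above, here is a minimal sketch of routing on problem["domain"]. The names resolve_rollout and dispatch are hypothetical and not taken from the PR; the real rollout callables are likely async and take more arguments (LLM client, config, session), so treat this only as the shape of the lookup.

```python
import importlib
from typing import Any, Callable, Dict

Rollout = Callable[[Dict[str, Any]], Any]


def resolve_rollout(dotted_path: str) -> Callable[..., Any]:
    """Import 'pkg.module.func' and return the function object (hypothetical helper)."""
    module_name, _, attr = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected a dotted path, got {dotted_path!r}")
    return getattr(importlib.import_module(module_name), attr)


def dispatch(problem: Dict[str, Any], domain_rollouts: Dict[str, Rollout]) -> Any:
    """Call the rollout registered for the problem's domain (assumed synchronous here)."""
    domain = problem.get("domain")
    if domain is None:
        raise KeyError("problem has no 'domain' field")
    if domain not in domain_rollouts:
        raise KeyError(f"no rollout registered for domain {domain!r}")
    return domain_rollouts[domain](problem)


# Toy usage with stand-in rollouts; real entries would come from actor.domain_rollouts.
toy_rollouts = {
    "math": lambda p: {"domain": p["domain"], "reward": 1.0},
    "coding": lambda p: {"domain": p["domain"], "reward": 0.0},
}
result = dispatch({"domain": "math", "task": "1 + 1"}, toy_rollouts)
```

Failing loudly on an unknown domain is a design choice in this sketch; the PR may handle that case differently.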

Configuration

  actor:
    domain_mix:        # sampling weights (normalised at runtime)
      math: 0.4
      coding: 0.3
      fn_calling: 0.3
    domain_rollouts:   # domain → rollout function mapping
      math: pipelinerl.domains.math.generate_math_rollout
      coding: pipelinerl.domains.coding.generate_coding_rollout
    domain_system_prompts:  # per-domain system prompts
      coding: "You are an expert Python programmer..."

  train_dataset_names:
    - math::open_reasoner_zero_57k
    - coding::coding@train
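
The dataset entries above use the domain::dataset form. A rough sketch of how such specs could be parsed and tagged is shown below, assuming the text before :: is the domain and everything after it (including any @split suffix) is passed through unchanged. The function names and the load_fn parameter are hypothetical, and rejecting specs without :: is an assumption of this sketch, not documented behaviour.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def parse_dataset_spec(spec: str) -> Tuple[str, str]:
    """Split 'domain::dataset' into (domain, dataset)."""
    domain, sep, name = spec.partition("::")
    if not sep or not domain or not name:
        raise ValueError(f"expected 'domain::dataset', got {spec!r}")
    return domain, name


def load_multidomain(
    specs: Iterable[str],
    load_fn: Callable[[str], List[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    """Load every dataset, inject its domain into each sample, and concatenate."""
    samples: List[Dict[str, Any]] = []
    for spec in specs:
        domain, name = parse_dataset_spec(spec)
        for sample in load_fn(name):
            # Copy so the caller's data is not mutated.
            samples.append({**sample, "domain": domain})
    return samples


# Toy loader standing in for the real per-domain dataset loaders.
def toy_load(name: str) -> List[Dict[str, Any]]:
    return [{"dataset": name, "task": f"{name}-0"}]


merged = load_multidomain(
    ["math::open_reasoner_zero_57k", "coding::coding@train"], toy_load
)
```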

Adaptive Sampling

DomainWeightedSampler adjusts weights dynamically: adjusted_weight = base_weight × (target_ratio / actual_ratio), clamped to [0.1, 10.0]. This compensates for domains with slower rollouts (e.g. coding sandbox execution) to maintain the configured mix in the output stream.
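
The rebalancing rule can be sketched as follows. This is not the DomainWeightedSampler code from the PR: the function names are hypothetical, and two details are assumptions. First, the clamp is applied to the target/actual factor rather than to the final weight. Second, a domain with no completions yet gets the maximum factor.

```python
import random
from typing import Dict


def adjusted_weights(
    base: Dict[str, float],
    completed: Dict[str, int],
    lo: float = 0.1,
    hi: float = 10.0,
) -> Dict[str, float]:
    """base_weight * clamp(target_ratio / actual_ratio, lo, hi) per domain."""
    base_total = sum(base.values())
    done_total = sum(completed.get(d, 0) for d in base)
    weights: Dict[str, float] = {}
    for domain, w in base.items():
        if done_total == 0:
            # Nothing completed yet: fall back to the configured mix.
            weights[domain] = w
            continue
        target_ratio = w / base_total
        actual_ratio = completed.get(domain, 0) / done_total
        factor = hi if actual_ratio == 0 else target_ratio / actual_ratio
        weights[domain] = w * min(max(factor, lo), hi)
    return weights


def sample_domain(weights: Dict[str, float], rng: random.Random) -> str:
    """Draw one domain in proportion to its (unnormalised) weight."""
    domains = list(weights)
    return rng.choices(domains, weights=[weights[d] for d in domains], k=1)[0]


mix = {"math": 0.4, "coding": 0.3, "fn_calling": 0.3}
# Coding is under-represented (20% completed vs. 30% target), math over-represented.
w = adjusted_weights(mix, {"math": 50, "coding": 20, "fn_calling": 30})
next_domain = sample_domain(w, random.Random(0))
```

With these counts the coding weight rises above its base value and the math weight drops below it, which is the compensation for slow rollouts described above.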

Metrics

Per-domain stats: domain_mix_actual/{domain}, domain_mix_target/{domain}, domain_mix_count/{domain}.
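
As a rough illustration of how these three metric families relate, the sketch below derives them from per-domain completion counts. It assumes that "actual" means the fraction of completed samples per domain and that "target" is the normalised configured weight; the function name is hypothetical and the PR's logging code may differ.

```python
from typing import Dict, Union

Number = Union[int, float]


def domain_mix_metrics(
    target_mix: Dict[str, float], counts: Dict[str, int]
) -> Dict[str, Number]:
    """Build domain_mix_{actual,target,count}/{domain} entries from raw counts."""
    weight_total = sum(target_mix.values())
    count_total = sum(counts.get(d, 0) for d in target_mix)
    metrics: Dict[str, Number] = {}
    for domain, weight in target_mix.items():
        count = counts.get(domain, 0)
        metrics[f"domain_mix_target/{domain}"] = weight / weight_total
        # Report 0.0 before any sample has completed to avoid dividing by zero.
        metrics[f"domain_mix_actual/{domain}"] = (
            count / count_total if count_total else 0.0
        )
        metrics[f"domain_mix_count/{domain}"] = count
    return metrics


stats = domain_mix_metrics({"math": 1.0, "coding": 1.0}, {"math": 3, "coding": 1})
```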

ehsk

LGTM overall except a couple of minor things!

ehsk · 2026-02-04T21:25:53Z

Here's also a sanity check comparing pre-multi-domain (shown as ref below; orange in the top row, light blue in the bottom row) against multi-domain trained on MATH only:

[Plots not preserved: Reward and Entropy curves (columns) for GRPO (top row) and GSPO (bottom row).]

rafapi added 30 commits November 6, 2025 19:26

Add environment selector

cac78d7

Fix env launcher

df1d846

Adapt domains to env registry

9735130

Adapt domain configs

bb5e5ca

Collect env info

a1a02bf

Remove unrelated files

e4d0bc4

Remove backup

da43cbc

Remove duplicates

5b18001

Restore

9af7329

add domains

599b510

add coding

32eb5b8

Add remaining loaders

efaec65

Domain mix tracking metrics

1220f6d

update domain rollouts

158b2ea

refresh async llm flow

fe8e728

sync coding init

6f9c5cc

expand coding dataset

455ed42

remove legacy executor

795e490

revise coding rollouts

68072b1

adjust multidomain loader

981cb74

refresh preprocess pipeline

f7e6946

enhance utils helpers

73ca9d1

add multi domain config

38ff188

introduce domain sampling

2c5ebfd

add coding sandbox test

40cf648

implement verifier api

2c74b77

add symbolic init

cdfe57b

add symbolic dataset

a3c4106

add symbolic rollouts

7eef15e

remove deleted domains

cc091ac

rafapi added 20 commits February 4, 2026 12:04

remove redundant domain extraction from actor

a71eeb8

add explicit domain field to math rollouts

ada52ee

add explicit domain field to coding rollouts

3b0930e

add explicit domain field to fn_calling rollouts

41f9610

add explicit domain field to guessing rollouts

dbfdc3a

add explicit domain field to counting rollouts

24a2b1d

add explicit domain field to counting tapeagent

492bfcb

add explicit domain field to chartqa rollouts

30ce103

add explicit domain field to miniwob rollouts

7ab076a

add explicit domain field to deep_research rollouts

d13a485

remove actor.domain_mix null override from base

48d15db

remove actor.domain_mix null override

fce7011

let single-env configs fall back to env_replicas

3e180a7

restore cfg.environment for tapeagents compat

5a94da6

remove math from debug config (needs verifier)

9aace39

revert web.yaml to main (keep environment singular)

b39e9f3

add math env to debug multi domain config

b8c71c4

add math env to debug multi domain config

dc9dc2f

add missing _self_ to multi_domain base

76a9a84

remove domain_mix from base defaults

a423d76

ehsk reviewed Feb 4, 2026

pipelinerl/domains/coding/rollouts.py (outdated, resolved)

ehsk reviewed Feb 4, 2026

pipelinerl/domains/README.md (outdated, resolved)

ehsk approved these changes Feb 4, 2026

ehsk reviewed Feb 4, 2026

conf/coding.yaml (outdated, resolved)

rafapi added 4 commits February 4, 2026 21:51

make sandbox url required

6e54a7f

remove readme

14cd4f0

move sandbox config to top level in coding.yaml

45efba8

use top-level sandbox config, make endpoint required

5ad90c3

rafapi merged commit 39830c4 into main Feb 4, 2026

rafapi merged 167 commits into main from multi-env

3 participants
